
GRAM: Global Reasoning for Multi-Page VQA

Blau, Tsachi; Fogel, Sharon; Ronen, Roi; Golts, Alona; Ganz, Roy; Avraham, Elad Ben; Aberdam, Aviad; Tsiper, Shahar; Litman, Ron

arXiv.org Artificial Intelligence

The increasing use of transformer-based large language models brings forward the challenge of processing long sequences. In document visual question answering (DocVQA), leading methods focus on the single-page setting, while documents can span hundreds of pages. We present GRAM, a method that seamlessly extends pre-trained single-page models to the multi-page setting, without requiring computationally heavy pretraining. To do so, we leverage a single-page encoder for local page-level understanding, and enhance it with designated document-level layers and learnable tokens, facilitating the flow of information across pages for global reasoning. To encourage the model to use the newly introduced document-level tokens, we propose a tailored bias adaptation method. For additional computational savings during decoding, we introduce an optional compression stage using our C-Former model, which reduces the encoded sequence length, thereby allowing a tradeoff between quality and latency. Extensive experiments showcase GRAM's state-of-the-art performance on the benchmarks for multi-page DocVQA, demonstrating the effectiveness of our approach.
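
To make the idea of alternating page-level and document-level processing concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names (`self_attention`, `encode_document`), the projection-free single-head attention, the residual updates, and the choice to prepend the same document tokens to every page are all assumptions made for illustration. The abstract only states that a single-page encoder is augmented with document-level layers and learnable tokens; the bias adaptation method and the C-Former compression stage are not modeled here.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Single-head self-attention with no learned projections (illustration only)."""
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

def encode_document(pages, doc_tokens, num_blocks=2):
    """Hypothetical encoder: pages is a list of (n_i, d) arrays, doc_tokens is (k, d)."""
    k = doc_tokens.shape[0]
    # Assumption: every page gets its own copy of the document-level tokens.
    states = [np.concatenate([doc_tokens, p], axis=0) for p in pages]
    for _ in range(num_blocks):
        # Page-level step: each page attends only within itself.
        states = [s + self_attention(s) for s in states]
        # Document-level step: only the document tokens of all pages attend to each other.
        global_in = np.concatenate([s[:k] for s in states], axis=0)
        global_out = global_in + self_attention(global_in)
        for i, s in enumerate(states):
            s[:k] = global_out[i * k:(i + 1) * k]
    return states

# Toy input: two pages with 5 and 3 token embeddings of width 8, plus 2 document tokens.
rng = np.random.default_rng(0)
pages = [rng.normal(size=(5, 8)), rng.normal(size=(3, 8))]
doc_tokens = rng.normal(size=(2, 8))
encoded = encode_document(pages, doc_tokens)
print([e.shape for e in encoded])
```

In this sketch the page-level step never mixes tokens from different pages, so its cost grows with the length of each page rather than the whole document. Cross-page information travels only through the short sequence of document tokens, which is one plausible reading of how such a design keeps long documents tractable.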


Egypt sets its sights on artificial intelligence

#artificialintelligence

Interest in artificial intelligence is on the rise in Egypt as enterprises embrace emerging technology to expand into new markets, investors back AI startups, and government initiatives support education and awareness of the technology. There is mounting evidence that private enterprise is embracing AI. Recently, for example, AI and analytics vendor fonYou partnered with a mobile operator in Egypt to use its AI module to reach the unbanked, and Widebot raised a six-figure (USD) Pre-Series A investment for its Arabic-language chatbot. Meanwhile, the government is looking to develop AI capabilities in a number of ways, including launching its first AI faculty at Kafr El Sheikh University. Egypt is aiming to have 7.7 percent of its GDP derived from AI by 2030, a figure cited in the PricewaterhouseCoopers (PwC) report, The Potential Impact of AI in the Middle East.